Topographic Transformation as a Discrete Latent Variable

Neural Information Processing Systems

Invariance to topographic transformations such as translation and shearing in an image has been successfully incorporated into feedforward mechanisms, e.g., "convolutional neural networks", "tangent propagation". We describe a way to add transformation invariance to a generative density model by approximating the nonlinear transformation manifold by a discrete set of transformations. An EM algorithm for the original model can be extended to the new model by computing expectations over the set of transformations. We show how to add a discrete transformation variable to Gaussian mixture modeling, factor analysis and mixtures of factor analysis. We give results on filtering microscopy images, face and facial pose clustering, and handwritten digit modeling and recognition.
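As a rough illustration of "computing expectations over the set of transformations", the sketch below adds a discrete shift variable to a Gaussian mixture over 1-D signals. It is not the authors' implementation: the cyclic shifts, the isotropic noise variance `var`, the uniform prior over shifts, and all function names are simplifying assumptions made here for brevity.

```python
import math

def shift(v, l):
    """Cyclically shift list v right by l positions (assumed transformation T_l)."""
    l %= len(v)
    return v[-l:] + v[:-l] if l else list(v)

def log_gauss(x, mean, var):
    """Log density of an isotropic Gaussian with scalar variance var."""
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * (len(x) * math.log(2 * math.pi * var) + sq / var)

def e_step(x, means, pis, shifts, var):
    """Posterior q[c][k] over (cluster c, shift shifts[k]) for one observation x.

    Assumes a uniform prior over the discrete set of shifts.
    """
    logp = [[math.log(pis[c]) - math.log(len(shifts))
             + log_gauss(x, shift(means[c], l), var)
             for l in shifts] for c in range(len(means))]
    top = max(max(row) for row in logp)          # subtract max for stability
    w = [[math.exp(v - top) for v in row] for row in logp]
    z = sum(sum(row) for row in w)
    return [[v / z for v in row] for row in w]

def m_step_means(data, posts, shifts, old_means):
    """Update each mean from the data shifted back by every candidate shift,
    weighted by the posterior from the E-step."""
    d = len(data[0])
    new_means = []
    for c, old in enumerate(old_means):
        num, den = [0.0] * d, 0.0
        for x, q in zip(data, posts):
            for k, l in enumerate(shifts):
                back = shift(x, -l)              # undo the candidate shift
                for i in range(d):
                    num[i] += q[c][k] * back[i]
                den += q[c][k]
        # Keep the old mean if the cluster received no posterior mass.
        new_means.append([v / den for v in num] if den > 0 else list(old))
    return new_means

# Toy usage: two templates, one observation that is template 0 shifted by 3.
template = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
wide = [0.0, 1.0, 1.0, 1.0, 0.0, 0.0]
shifts = list(range(6))
x = shift(template, 3)
q = e_step(x, [template, wide], [0.5, 0.5], shifts, var=0.01)
new_means = m_step_means([x], [q], shifts, [template, wide])
```

In this toy setting the posterior mass concentrates on cluster 0 with a shift of 3, so the mean update recovers the unshifted template. A fuller model would also learn the noise variance, the mixing proportions and a non-uniform prior over transformations.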


A very small amount of shearing will move the point only slightly, so deforming the object by shearing will trace a continuous curve in the space of pixel intensities. As illustrated in Fig. 1a, extensive levels of shearing will produce a highly nonlinear curve (consider shearing a thin vertical line), although the curve can be approximated by a straight line locally. Linear approximations of the transformation manifold have been used to significantly improve the performance of feedforward discriminative classifiers such as nearest neighbors (Simard et al., 1993) and multilayer perceptrons (Simard et al., 1992). Linear generative models (factor analysis, mixtures of factor analysis) have also been modified using linear approximations of the transformation manifold to build in some degree of transformation invariance (Hinton et al., 1997). In general, the linear approximation is accurate for transformations that couple neighboring pixels, but is inaccurate for transformations that couple nonneighboring pixels. In some applications (e.g., handwritten digit recognition), the input can be blurred so that the linear approximation becomes more robust. For significant levels of transformation, the nonlinear manifold can be better modeled using a discrete approximation. For example, the curve in Fig. 1a can be

N. Jojic and B. J. Frey
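The contrast between linear and discrete approximations of the transformation manifold can be checked numerically on a toy case. The sketch below uses cyclic translation of a 1-D signal as a stand-in for shearing of an image. The signals, the central-difference tangent, and the function names are assumptions chosen for illustration and are not taken from the paper.

```python
import math

def shift(v, l):
    """Cyclically shift list v right by l positions."""
    l %= len(v)
    return v[-l:] + v[:-l] if l else list(v)

def sq_err(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def linear_approx(x, s):
    """First-order approximation of shift(x, s): x plus s times a tangent
    vector estimated by a central difference of one-pixel shifts."""
    fwd, bwd = shift(x, 1), shift(x, -1)
    return [xi + s * (f - b) / 2.0 for xi, f, b in zip(x, fwd, bwd)]

def discrete_approx(x, target, shifts):
    """Closest member of the discrete set {shift(x, l) : l in shifts}."""
    return min((shift(x, l) for l in shifts), key=lambda c: sq_err(c, target))

n = 32
# A thin line (one bright pixel) and a blurred version of it (sigma = 3).
spike = [1.0 if i == n // 2 else 0.0 for i in range(n)]
bump = [math.exp(-((i - n // 2) ** 2) / (2 * 3.0 ** 2)) for i in range(n)]

# One-pixel shift: the linear approximation fails for the thin line
# but works well once the signal is blurred.
err_spike = sq_err(linear_approx(spike, 1), shift(spike, 1))
err_bump = sq_err(linear_approx(bump, 1), shift(bump, 1))

# Five-pixel shift: even the blurred signal is poorly served by the tangent,
# while a discrete set of shifts that contains the true one matches it.
target = shift(bump, 5)
err_far_linear = sq_err(linear_approx(bump, 5), target)
err_far_discrete = sq_err(discrete_approx(bump, target, range(8)), target)
```

Under these assumptions, the linear error is large for the thin line, small for the blurred signal at a one-pixel shift, and large again at a five-pixel shift, whereas the discrete set reproduces the shifted signal whenever the true shift is among the candidates.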

